Metrics and norms used for obtaining sparse solutions 
to underdetermined Systems of Linear Equations 

Leoni Dalla and George K. Papageorgiou 
February 28, 2013






Abstract 

This paper focuses on defining a measure appropriate for obtaining optimally
sparse solutions to underdetermined systems of linear equations.¹ The general idea
is the extension of metrics in n-dimensional spaces via the Cartesian product of metric
spaces.



1 Introduction

In general topology, mathematicians long ago defined measures that saw minimal use (if any) in applications. Later on, the development of Measure Theory was mandatory for the progress of applied mathematics and other sciences. Along with the progress in computer science came the demand for defining measures of an unusual nature and uncovering the properties they obey.

In signal (image or sound) processing, a common problem is how to transfer a signal using a sparse (economical, but sufficient) representation [2], [3]. Given a specific matrix $A$ of dimension $m \times n$, $m, n \in \mathbb{N}$ with $m < n$ (underdetermined), and a vector $b$, find among all solutions the sparsest (or a less sparse) solution $x \in \mathbb{R}^n$ of the linear system $Ax = b$. This is the simplest form of the problem, which means that the noise of the signal is not included (noiseless problem).

Since an underdetermined system of linear equations that is consistent has an infinite number of solutions, they need to be filtered, using additional functions, in order to obtain solutions of a certain type according to specific criteria. Functions that measure "energy", like the $\ell_2$ norm, are used on many occasions, yet measuring sparsity requires a measure of "sparsity", i.e. a different function [2], [3].
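As an illustration of the difference between an "energy" criterion and a sparsity criterion, the following minimal sketch compares two solutions of a small underdetermined system. The matrix `A`, the vector `b` and all function names are hypothetical choices made for this example and do not come from the paper; the minimum-energy solution is computed with the standard formula $x = A^T (A A^T)^{-1} b$, which assumes that $A$ has full row rank.

```python
from fractions import Fraction

# Illustrative 2 x 3 system (hypothetical data, not taken from the paper).
A = [[1, 2, 0],
     [0, 1, 1]]
b = [2, 1]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector, using exact arithmetic."""
    return [sum(Fraction(m) * Fraction(x) for m, x in zip(row, v)) for row in M]

def min_energy_solution(A, b):
    """Minimum l2-norm solution x = A^T (A A^T)^{-1} b for a full-rank 2 x n matrix."""
    r0, r1 = A
    # Entries of the 2 x 2 Gram matrix G = A A^T.
    g00 = sum(Fraction(u) * u for u in r0)
    g01 = sum(Fraction(u) * v for u, v in zip(r0, r1))
    g11 = sum(Fraction(v) * v for v in r1)
    det = g00 * g11 - g01 * g01
    # Solve G * lam = b by Cramer's rule.
    lam0 = (g11 * b[0] - g01 * b[1]) / det
    lam1 = (g00 * b[1] - g01 * b[0]) / det
    return [lam0 * u + lam1 * v for u, v in zip(r0, r1)]

def count_nonzero(x):
    """Number of nonzero coordinates of x."""
    return sum(1 for xi in x if xi != 0)

def energy(x):
    """Squared l2 norm of x."""
    return sum(xi * xi for xi in x)

x_energy = min_energy_solution(A, b)   # dense solution with the least energy
x_sparse = [0, 1, 0]                   # another solution with one nonzero entry

for name, x in (("min-energy", x_energy), ("sparse", x_sparse)):
    print(name, [str(xi) for xi in x], count_nonzero(x), energy(x))
```

Both vectors solve $Ax = b$, but the minimum-energy solution uses all three coordinates, whereas the sparse one uses a single coordinate at the cost of a larger $\ell_2$ norm.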

The optimization task is minimizing the number of nonzero coordinates of a vector in n dimensions, i.e. finding a sparse representative of the signal. The number of nonzero coordinates of a vector $x$ is the number of elements of the set of indices of its nonzero coordinates, which is called the support of the vector, i.e. supp{x}. Also, in recent work of Donoho and Elad the measure was "used" under the symbol of a norm, $\|x\|_0 = \#\{i : x_i \neq 0\}$, but it is clear that it does not satisfy the norm properties [2], [3].

¹ The following work was done as part of my master's thesis, titled "Algorithms for the computation of sparse solutions of undefined systems of equations", at the Department of Mathematics, University of Athens, which was assigned to me in association with the Department of Informatics and Telecommunications, National and Kapodistrian University of Athens.

In this paper, we begin by posing some examples from everyday life, where different
measures are needed in order to figure out distances. After a short review of metric
spaces follows the definition of p-metrics in a Cartesian product space. The next section
is of main interest, since we equip $\mathbb{R}^n$ with metrics and prove that the discrete metric
can be obtained as a limit of a p-metric [2]. In addition, a review of norms and of
the correlation between norms and metrics follows. Finally, we conclude with a comparison of
functions, in the quest for a convex one suitable for optimization tasks.



2 Measures in everyday life 

In everyday life people subconsciously use measures in order to figure out distances, albeit 
those measures are not always well defined. Given an arbitrary set of points, a matter of 
most concern is to measure the distances between those points. However, the measure that 
we use in every problem is different and depends on the scale we would like to use, as well 
as the structure of the setting. 

The distance between Athens and New York is measured using the geodesic line between
those points, i.e. the shortest route between two points on Earth's surface (Fig. 1).




Figure 1: Geodesic distance between Athens - New York: 7920 km 



In the case of a road trip, the travel distance depends on the structure of the road network and does not
coincide with the shortest distance between those two points (towns) (Fig. 2).

Figure 2: Travel distance between Athens - Thessaloniki: 502 km

Furthermore, the existence of distances that differ from our perception of the shortest
path cannot pass unnoticed. The distance that a person has to travel in the area of
Manhattan (borough of New York City) in order to move from Times Square to the junction
of 57th Street with 9th Avenue depends on the structure of the setting (Fig. 3).

Figure 3: Distance in the area of Manhattan, borough of New York City.

Another measure, used in order to define distances between compact sets, e.g. the
distance between two islands, is the Hausdorff distance (named after Felix Hausdorff)
between the whole sets $K$ and $A$, defined as

\[ h(K, A) = \max\Big\{ \max_{k \in K} \min_{a \in A} |k - a|,\ \max_{a \in A} \min_{k \in K} |k - a| \Big\}. \]

The latter represents, e.g. the smallest distance that suffices in order to move from
any village of the island of Andros (or Kefallonia) to the nearest village of the island of Kefallonia
(or Andros) (Fig. 4).
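For finite point sets the Hausdorff distance can be computed directly from its definition. The sketch below is only an illustration: the function name `hausdorff` and the two small point sets are hypothetical, and the points are assumed to lie in the Euclidean plane.

```python
import math

def hausdorff(K, A):
    """Hausdorff distance between two finite, nonempty point sets in the plane."""
    def directed(P, Q):
        # Largest distance from a point of P to its nearest point of Q.
        return max(min(math.dist(p, q) for q in Q) for p in P)
    # The Hausdorff distance is the larger of the two directed distances.
    return max(directed(K, A), directed(A, K))

# Two hypothetical "islands", each given by the coordinates of its villages.
K = [(0.0, 0.0), (1.0, 0.0)]
A = [(0.0, 3.0), (5.0, 0.0)]

print(hausdorff(K, A))
```

Taking the maximum of both directed distances is what makes the result symmetric in $K$ and $A$.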




Figure 4: Distance between Kefallonia - Andros 

So far, we have seen cases where the concept of distance needs to be mathematically 
defined in order to understand, develop and solve problems arising from very different 
settings. Hence, we should define the means needed in order to measure in a wide variety 
of cases. 



3 Metric spaces 

Definition 1 Let $X$ be an arbitrary nonempty set. A metric² (or distance) in $X$ is a map
$\rho : X \times X \to \mathbb{R}$ obeying the following properties:

1. $\rho(x, y) \geq 0$, $\forall x, y \in X$ and $\rho(x, y) = 0 \Leftrightarrow x = y$

2. $\rho(x, y) = \rho(y, x)$, $\forall x, y \in X$ (Symmetric property)

3. $\rho(x, y) \leq \rho(x, z) + \rho(z, y)$, $\forall x, y, z \in X$ (Triangular inequality)

The elements of the set are called points, the nonnegative real number $\rho(x, y)$ is called
the distance between $x, y \in X$ and the pair $(X, \rho)$ is called a metric space.

² Denoted by $d$, $\rho$ or $\sigma$.

Consequently, a set equipped with a metric automatically obtains the structure of a topological space.³ We now define the open and the closed ball of center $x_0 \in X$ and radius
$r > 0$, notions necessary for the topological description of a metric space.

Definition 2 Let $(X, d)$ be a metric space, $x_0 \in X$ and $r > 0$. The set $S_d(x_0, r) = \{x \in X : d(x, x_0) < r\}$ is called an open ball of center $x_0$ and radius $r$.

Definition 3 Let $(X, d)$ be a metric space, $x_0 \in X$ and $r > 0$. The set $S_d(x_0, r) = \{x \in X : d(x, x_0) \leq r\}$ is called a closed ball of center $x_0$ and radius $r$.

Definition 4 The set $A \subset X$ is called an open set, if for every $a \in A$ there exists $r > 0$
such that the open ball $S_d(a, r) \subset A$.

Definition 5 The set $B \subset X$ is called a closed set, if its complement $X \setminus B$ is an open set.

Definition 6 The set $\Gamma \subset X$ is bounded if there exist $x_0 \in X$ and $r > 0$, such that
$\Gamma \subset S_d(x_0, r)$.
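The three properties of Definition 1 can be tested numerically on a finite sample of points. The following sketch is a hypothetical helper written for this illustration (the name `satisfies_metric_axioms` and the sample values are assumptions); passing the test on a sample does not prove that a map is a metric, but a single failure shows that it is not.

```python
from itertools import product

def satisfies_metric_axioms(points, rho, tol=1e-12):
    """Check the properties of Definition 1 on all pairs and triples of a finite sample."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        # Property 1: nonnegativity, and zero distance exactly when x == y.
        if d < 0 or (d == 0) != (x == y):
            return False
        # Property 2: symmetry.
        if abs(d - rho(y, x)) > tol:
            return False
    # Property 3: triangular inequality.
    for x, y, z in product(points, repeat=3):
        if rho(x, y) > rho(x, z) + rho(z, y) + tol:
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.0]
absolute = lambda x, y: abs(x - y)    # the usual distance on the real line
squared = lambda x, y: (x - y) ** 2   # violates the triangular inequality

print(satisfies_metric_axioms(sample, absolute))
print(satisfies_metric_axioms(sample, squared))
```

The squared difference fails because, for instance, the points $-2$, $0$ and $3$ give $25 > 4 + 9$.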

Examples of Metric spaces: 

• The most common metrics to use in $\mathbb{R}^n$ are $d_1$, $d_2$ between its points $x = (x_1, x_2, \dots, x_n)$,
$y = (y_1, y_2, \dots, y_n)$.

The metric $d_1$ (Manhattan metric) in $\mathbb{R}^n$ is defined as

\[ d_1(x, y) = \sum_{i=1}^{n} |x_i - y_i|. \]

Thus, in $\mathbb{R}^2$: $d_1(x, y) = |x_1 - y_1| + |x_2 - y_2|$ and the closed ball is respectively
$S_{d_1}(0, r) = \{x \in \mathbb{R}^2 : d_1(x, 0) \leq r\}$ (Fig. 5-i, with $r = 1$).

The metric $d_2$ (Euclidean metric) in $\mathbb{R}^n$ is defined as

\[ d_2(x, y) = \Big( \sum_{i=1}^{n} (x_i - y_i)^2 \Big)^{1/2}. \]

Thus, in $\mathbb{R}^2$: $d_2(x, y) = \big( (x_1 - y_1)^2 + (x_2 - y_2)^2 \big)^{1/2}$ and the closed ball is respectively
$S_{d_2}(0, r) = \{x \in \mathbb{R}^2 : d_2(x, 0) \leq r\}$ (Fig. 5-ii, with $r = 1$).



³ A topological space does not have to be a metric space.



• In every nonempty set $X$ the metric $\sigma_0$ (discrete metric) between points $x, y \in X$
is defined as

\[ \sigma_0(x, y) = \begin{cases} 1, & \text{for } x \neq y \\ 0, & \text{for } x = y. \end{cases} \tag{1} \]

Thus, the closed ball is

\[ S_{\sigma_0}(x_0, r) = \begin{cases} \{x_0\}, & 0 < r < 1 \\ X, & r \geq 1 \end{cases} \]

(Fig. 5-iii, with $X = (\alpha, \beta) \subset \mathbb{R}$).

Figure 5: i) $S_{d_1}(0, 1) = \{x \in \mathbb{R}^2 : d_1(x, 0) \leq 1\}$. ii) The closed ball $S_{d_2}(0, 1) = \{x \in \mathbb{R}^2 : d_2(x, 0) \leq 1\}$ of the Euclidean metric in $\mathbb{R}^2$ is the unit disc. iii) (a) For $x_0 \in X = (\alpha, \beta) \subset \mathbb{R}$ and $0 < r < 1$, $S_{\sigma_0}(x_0, r) = \{x_0\}$. iii) (b) For $x_0 \in X = (\alpha, \beta) \subset \mathbb{R}$ and $r \geq 1$ the closed ball is the entire set $X$, i.e. $S_{\sigma_0}(x_0, r) = (\alpha, \beta)$.



3.1 Cartesian product space 

If the set $X$ is arbitrary and not of a specific structure (e.g. vector space), the discrete
metric (1) seems to be the only available choice.

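Given the metric spaces $(X_i, \rho_i)$, $i = 1, 2, \dots, n$, we define the p-metrics ($p \geq 1$)⁴ in the
Cartesian product $X = X_1 \times X_2 \times \dots \times X_n$ for $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in X$:

\[ d_{(\rho_1, \rho_2, \dots, \rho_n;\, p)}(x, y) = \begin{cases} \Big( \sum_{i=1}^{n} \big( \rho_i(x_i, y_i) \big)^p \Big)^{1/p}, & \text{for } 1 \leq p < +\infty \\ \max\{\rho_i(x_i, y_i),\ i = 1, 2, \dots, n\}, & \text{for } p = +\infty \end{cases} \tag{2} \]

In case $X_1 = X_2 = \dots = X_n = Y$, i.e. $X = Y^n$, and $\rho_1 = \rho_2 = \dots = \rho_n = \rho$, we denote
$d_{(\rho;\, p)}(\cdot, \cdot)$ instead of $d_{(\rho_1, \rho_2, \dots, \rho_n;\, p)}(\cdot, \cdot)$.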



⁴ For $0 < p < 1$ the triangular inequality does not hold, hence we do not define a metric.



The metrics in (2) are compatible with the ones that already exist in $X_1, X_2, \dots, X_n$ in
the following sense. Let $(y_1, \dots, y_n) \in X$ be an arbitrary fixed point. Identifying $x_i \in X_i$
with $(y_1, \dots, x_i, \dots, y_n) \in X$ we have

\[ d_{(\rho_1, \rho_2, \dots, \rho_n;\, p)}\big( (y_1, \dots, x_i, \dots, y_n),\ (y_1, \dots, x_i', \dots, y_n) \big) = \rho_i(x_i, x_i'), \quad \text{for } p \geq 1, \]

where $i$ is the index corresponding to the metric space $(X_i, \rho_i)$.
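Equation (2) and the compatibility property above translate almost literally into code. The sketch below is a minimal illustration under stated assumptions: the function name `product_metric` and the lambda helpers are hypothetical, component metrics are passed as plain Python callables, and $p = +\infty$ is represented by `math.inf`.

```python
import math

def product_metric(rhos, p):
    """Return the p-metric of equation (2) built from the component metrics in rhos."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= +inf")
    def d(x, y):
        # Distances measured coordinate by coordinate with the respective metric.
        parts = [rho(xi, yi) for rho, xi, yi in zip(rhos, x, y)]
        if math.isinf(p):
            return max(parts)
        return sum(part ** p for part in parts) ** (1.0 / p)
    return d

absolute = lambda a, b: abs(a - b)              # usual metric on the real line
discrete = lambda a, b: 0.0 if a == b else 1.0  # discrete metric (1)

d2 = product_metric([absolute, absolute], 2)           # Euclidean metric in the plane
dinf = product_metric([absolute, absolute], math.inf)  # maximum metric
mixed = product_metric([discrete, absolute], 1)        # different metric per coordinate

print(d2((0, 0), (3, 4)), dinf((0, 0), (3, 4)), mixed((0, 0), (7, 2)))
# Compatibility: changing only the second coordinate reproduces the component metric.
print(d2((1, 5), (1, 9)), absolute(5, 9))
```

Passing a list of callables keeps the construction independent of the nature of each factor space, which mirrors the generality of the Cartesian product definition.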

3.2 $\mathbb{R}^n$ ($n \geq 1$) equipped with metrics

Due to the discrete nature of computers, our main interest is in the case where the set $X$ of the metric space
is a vector space or subspace. In most applications the space that appears is
$\mathbb{R}^n$ or subsets of this space. The axiomatic foundation of the set $\mathbb{R}$ of real numbers gives
us the latitude to define the metric $\sigma_{|\cdot|}(x, y) = |x - y|$, $x, y \in \mathbb{R}$, where $|\cdot|$ stands for the
absolute value of a real number. More generally we may take the metrics $\sigma_s(x, y) = |x - y|^s$,
for $0 < s \leq 1$ ($\sigma_1 = \sigma_{|\cdot|}$). At this point it is important to consider that

\[ \lim_{s \to 0^+} \sigma_s(x, y) = \sigma_0(x, y). \tag{3} \]

In the case of the set $X \subseteq \mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R} = \mathbb{R}^n$, the p-metrics resulting from
$(\mathbb{R}, \sigma_{|\cdot|})$, $(\mathbb{R}, \sigma_s)$ for $0 < s < 1$ and $(\mathbb{R}, \sigma_0)$ emerge for points $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$,
respectively. Alternatively, a combination of $\sigma$-metrics is also possible in order to measure
in a different way among subsets of $X$; however, the latter choice is rarely used in practice.
Analytically we use the following metrics:

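Usual metrics in $\mathbb{R}^n$:

• Let $\mathbb{R}$ be equipped with the metric $\sigma_{|\cdot|}(x, y) = |x - y|$. It follows that $\mathbb{R}^n$ is equipped with the
metric $d_{(\sigma_{|\cdot|};\, p)}$ which according to (2) leads to:

\[ d_{(\sigma_{|\cdot|};\, p)}(x, y) = \begin{cases} \Big( \sum_{i=1}^{n} |x_i - y_i|^p \Big)^{1/p}, & \text{for } 1 \leq p < +\infty \\ \max\{|x_i - y_i|,\ i = 1, \dots, n\}, & \text{for } p = +\infty \end{cases} \tag{4} \]

Thus, for $p = 2$ we have the Euclidean metric in $\mathbb{R}^n$ (Fig. 6).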

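• Let $\mathbb{R}$ be equipped with the metric $\sigma_s(x, y) = |x - y|^s$ for $0 < s < 1$. It follows that $\mathbb{R}^n$ is
equipped with the metric $d_{(\sigma_s;\, p)}$ that according to (2) leads to:

\[ d_{(\sigma_s;\, p)}(x, y) = \begin{cases} \Big( \sum_{i=1}^{n} \big( \sigma_s(x_i, y_i) \big)^p \Big)^{1/p}, & \text{for } 1 \leq p < +\infty \\ \max\{\sigma_s(x_i, y_i),\ i = 1, \dots, n\}, & \text{for } p = +\infty \end{cases} \tag{5} \]

Specifically, for $p = 1$ we have:

\[ d_{(\sigma_s;\, 1)}(x, y) = \sum_{i=1}^{n} |x_i - y_i|^s, \quad 0 < s < 1. \tag{6} \]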




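Figure 6: $S_{d_{(\sigma_{|\cdot|};\, p)}}(0, 1) = \{x \in \mathbb{R}^2 : d_{(\sigma_{|\cdot|};\, p)}(x, 0) \leq 1\}$ for different values of $p$. Notice that
while $p$ increases, the ball of our space tends to be $S_{d_{(\sigma_{|\cdot|};\, +\infty)}}(0, 1)$, whereas as $p$ decreases to 1, it
tends to be $S_{d_{(\sigma_{|\cdot|};\, 1)}}(0, 1)$.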



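Discrete metric in $\mathbb{R}^n$:

Let $\mathbb{R}$ be equipped with the metric

\[ \sigma_0(x, y) = \begin{cases} 1, & \text{for } x \neq y \\ 0, & \text{for } x = y, \end{cases} \]

thus $\mathbb{R}^n$ is equipped with the metric $d_{(\sigma_0;\, p)}$ that according to (2) results in:

\[ d_{(\sigma_0;\, p)}(x, y) = \begin{cases} \Big( \sum_{i=1}^{n} \big( \sigma_0(x_i, y_i) \big)^p \Big)^{1/p}, & \text{for } 1 \leq p < +\infty \\ \max\{\sigma_0(x_i, y_i),\ i = 1, \dots, n\}, & \text{for } p = +\infty \end{cases} \tag{7} \]

Hence considering the case $p = 1$ we have

\[ d_{(\sigma_0;\, 1)}(x, y) = \#\{i : x_i \neq y_i\}. \tag{8} \]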

As $d_{(\sigma_s;\, 1)}$ is of most importance in sparse theory, we denote it as $d_s$, when it cannot be confused
with any other metric, and use the symbolism $S_s$ for the closed ball respectively. Finally,
combining equation (6) with both (3) and (8) we obtain:

\[ \lim_{s \to 0^+} d_s(x, y) = \lim_{s \to 0^+} \sum_{i=1}^{n} |x_i - y_i|^s = \#\{i : x_i \neq y_i\} = d_0(x, y). \tag{9} \]
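The limit in equation (9) can be observed numerically. The following sketch uses hypothetical function names `d_s` and `d_0` and an arbitrary example vector; it only evaluates the sums of equations (6) and (8) for decreasing values of $s$.

```python
def d_s(x, y, s):
    """d_s(x, y) = sum_i |x_i - y_i|^s for 0 < s <= 1, as in equation (6)."""
    return sum(abs(xi - yi) ** s for xi, yi in zip(x, y))

def d_0(x, y):
    """d_0(x, y) = number of coordinates in which x and y differ, as in equation (8)."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

x = (3.0, 0.0, -0.2, 0.0)   # illustrative vector with two nonzero coordinates
y = (0.0, 0.0, 0.0, 0.0)

for s in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(s, d_s(x, y, s))
print("limit:", d_0(x, y))
```

Each nonzero difference contributes a term $|x_i - y_i|^s$ that tends to 1 as $s \to 0^+$, while each zero difference contributes 0 for every $s > 0$, so the printed values approach the count of differing coordinates.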



The final equation indicates the behaviour of closed balls. In Fig. 7 it can be easily
seen that in $\mathbb{R}^2$ and for $r \in (0, 1)$ ($r = 0.5$) the balls $S_s(0, r)$ decrease, i.e. for $0 < s' < s < 1$
we have $S_{s'}(0, r) \subset S_s(0, r)$, and finally tend to be the ball $S_0(0, r) = \bigcap_{0 < s < 1} S_s(0, r)$. For
$r \in [1, 2)$ ($r = 1.5$) a subset relation does not exist between the balls $S_{s'}(0, r)$ and $S_s(0, r)$
for $s' < s$; however, $S_s(0, r)$ decreases and tends to coincide with the axes while $s \to 0^+$, i.e.
the ball $S_0(0, r)$. For $r \in [2, +\infty)$ ($r = 3$) the balls increase and for $0 < s' < s < 1$ we have
$S_{s'}(0, r) \supset S_s(0, r)$ until they finally fill the whole space, while $S_0(0, r) = \bigcup_{0 < s < 1} S_s(0, r)$.

Equation (9) indicates a desirable measure of sparsity, defined as

\[ d_0(x, 0) = \#\{i : x_i \neq 0\}, \tag{10} \]

which measures the number of nonzero coordinates⁵ of a vector and belongs to the family of
metrics.
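That $d_0$ behaves like a metric but $d_0(x, 0)$ does not behave like a norm can be checked on a small sample. The sketch below is illustrative only: the names `d_0` and `l0` and the sample set are assumptions, and an exhaustive check on a finite sample supports, but does not prove, the triangular inequality.

```python
from itertools import product

def d_0(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def l0(x):
    """Number of nonzero coordinates, i.e. d_0(x, 0) of equation (10)."""
    return d_0(x, [0] * len(x))

# All vectors with entries in {-1, 0, 2} and three coordinates.
vectors = list(product((-1, 0, 2), repeat=3))

# Triangular inequality on every triple of the sample.
triangle_ok = all(d_0(x, y) <= d_0(x, z) + d_0(z, y)
                  for x, y, z in product(vectors, repeat=3))

# Positive homogeneity fails: scaling a vector does not change its count.
x = (0, 3, 0, -1)
scaled = [2 * xi for xi in x]
print(triangle_ok, l0(x), l0(scaled))
```

Since `l0(scaled)` equals `l0(x)` rather than `2 * l0(x)`, the measure is not positive homogeneous, in agreement with the remark in the Introduction that $\|x\|_0$ does not satisfy the norm properties.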

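Alternative metrics in $\mathbb{R}^n$:

Another measure, constructed by a combination of different $\sigma$-metrics, enables us to measure each subset differently. In its simplest form we state an example in $\mathbb{R}^2$. Let the metric
space be $(\mathbb{R}, \sigma_0) \times (\mathbb{R}, \sigma_{|\cdot|})$ and set $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^2$. Then

\[ d_{(\sigma_0, \sigma_{|\cdot|};\, p)}(x, y) = \Big( \big( \sigma_0(x_1, y_1) \big)^p + |x_2 - y_2|^p \Big)^{1/p}, \quad 1 \leq p < +\infty. \]

Thus, for $p = 1$:

\[ d_{(\sigma_0, \sigma_{|\cdot|};\, 1)}(x, y) = \sigma_0(x_1, y_1) + |x_2 - y_2|. \tag{11} \]

Consequently, the closed ball of center 0 and radius $r$ is (Fig. 8):

\[ S(0, r) = \{x \in \mathbb{R}^2 : \sigma_0(x_1, 0) + |x_2| \leq r\} = \begin{cases} -r \leq x_2 \leq r, & x_1 = 0 \\ 1 - r \leq x_2 \leq r - 1, & x_1 \neq 0. \end{cases} \tag{12} \]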
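The two descriptions of the closed ball in equation (12) can be compared point by point. The sketch below is a small consistency check with hypothetical function names; the grid of test points uses multiples of 0.25 so that the floating-point comparisons are exact.

```python
def mixed_distance(x, y):
    """d(x, y) = sigma_0(x1, y1) + |x2 - y2|, as in equation (11)."""
    sigma0 = 0.0 if x[0] == y[0] else 1.0
    return sigma0 + abs(x[1] - y[1])

def in_ball_by_definition(x, r):
    """Membership in S(0, r) straight from the definition of a closed ball."""
    return mixed_distance(x, (0.0, 0.0)) <= r

def in_ball_by_cases(x, r):
    """Membership in S(0, r) using the two branches of equation (12)."""
    x1, x2 = x
    if x1 == 0:
        return -r <= x2 <= r
    # This interval is empty when r < 1.
    return 1 - r <= x2 <= r - 1

grid = [(a * 0.25, c * 0.25) for a in range(-8, 9) for c in range(-8, 9)]
for r in (0.5, 1.0, 1.5):
    agree = all(in_ball_by_definition(p, r) == in_ball_by_cases(p, r) for p in grid)
    print(r, agree)
```

For $r < 1$ the second branch is empty, so the ball reduces to a segment of the vertical axis, which matches panel i) of Figure 8.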



⁵ The set of indices of the nonzero coordinates is called the support of the vector and is denoted as supp{x}.



Figure 7: $S_s(0, r) = \{x = (x, y) \in \mathbb{R}^2 : d_s(x, 0) \leq r\}$, $0 < s < 1$, with panels i) $r = 0.5$, ii) $r = 1.5$, iii) $r = 3$. i) For $0 < r < 1$, $S_0(0, r) = \{(0, 0)\}$. ii) For $1 \leq r < 2$, $S_0(0, r) = \{x = (x, y) \in \mathbb{R}^2 : x = 0 \text{ or } y = 0\}$. iii) For $r \geq 2$, $S_0(0, r) = \mathbb{R}^2$.

Figure 8: $S(0, r) = \{x = (x, y) \in \mathbb{R}^2 : \sigma_0(x, 0) + |y| \leq r\}$, with panels i) $r = 0.5$, ii) $r = 1$, iii) $r = 1.5$. i) For $0 < r = 0.5 < 1$, $S(0, 0.5) = \{-1/2 \leq y \leq 1/2,\ x = 0\}$. ii) For $r = 1$, $S(0, 1) = \{-1 \leq y \leq 1,\ x = 0 \text{ or } y = 0,\ x \neq 0\}$. iii) For $r = 1.5 > 1$, $S(0, 1.5) = \{-3/2 \leq y \leq 3/2,\ x = 0 \text{ or } -1/2 \leq y \leq 1/2,\ x \neq 0\}$.

4 Normed spaces 

Definition 7 A vector (linear) space is a triple $(X, +, \cdot)$, where $X$ is a nonempty
set, $+ : X \times X \to X$ is an inner operation (addition) and $\cdot : F \times X \to X$⁶ is an outer
operation (scalar multiplication) that obey the following properties:

1. $x + y = y + x$, $\forall x, y \in X$,

2. $(x + y) + z = x + (y + z)$, $\forall x, y, z \in X$,

3. There exists $0 \in X$ such that $x + 0 = 0 + x = x$, $\forall x \in X$,

4. For all $x \in X$ there exists $-x \in X$ such that $x + (-x) = (-x) + x = 0$,

5. $\lambda(x + y) = \lambda x + \lambda y$, $\forall x, y \in X$ and $\lambda \in F$,

6. $(\lambda + \mu)x = \lambda x + \mu x$, $\forall x \in X$ and $\lambda, \mu \in F$,

7. $\lambda(\mu x) = (\lambda \mu)x$, $\forall x \in X$ and $\lambda, \mu \in F$,

8. $1x = x$, $\forall x \in X$.

The elements of a vector space are called vectors.

⁶ The field $F$ …

Definition 8 Let $X$ be a vector space over a field of numbers $F$. The set $A \subset X$ is called
convex, if for every pair $x, y \in A$ and every $t \in [0, 1]$, the element $tx + (1 - t)y$ belongs
to the set $A$ as well.

Definition 9 A real function $f : A \to \mathbb{R}$ defined over a convex subset $A$ of a linear space $X$
is called convex, if for every $x, y \in A$ and $t \in [0, 1]$,

\[ f(tx + (1 - t)y) \leq t f(x) + (1 - t) f(y). \]

Definition 10 A real function $f : A \to \mathbb{R}$ defined over a convex subset $A$ of a linear space
$X$ is called concave, if for every $x, y \in A$ and $t \in [0, 1]$,

\[ f(tx + (1 - t)y) \geq t f(x) + (1 - t) f(y). \]

Let $\mathbb{R}$ be the vector space. Then the absolute value obeys the following properties:

1. $|x| \geq 0$, $x \in \mathbb{R}$ and $|x| = 0 \Leftrightarrow x = 0$

2. $|\lambda x| = |\lambda||x|$, $x \in \mathbb{R}$, $\lambda \in \mathbb{R}$ (positive homogeneous)

3. $|x + y| \leq |x| + |y|$, $x, y \in \mathbb{R}$ (triangular inequality)

Therefore the function $f(x) = |x|$ is positive homogeneous, convex and $f(x) > 0$ for $x \neq 0$.
A norm is the generalization of the absolute value in higher-dimensional vector spaces.

Definition 11 Let $(X, +, \cdot)$ be a real vector space. The map $\|\cdot\| : X \to \mathbb{R}$ is called a
norm if it obeys the following properties:

1. $\|x\| \geq 0$, $\forall x \in X$ and $\|x\| = 0 \Leftrightarrow x = 0$,

2. $\|\lambda x\| = |\lambda| \|x\|$, $\forall x \in X$ and $\lambda \in \mathbb{R}$ (positive homogeneous)

3. $\|x + y\| \leq \|x\| + \|y\|$, $\forall x, y \in X$ (triangular inequality)

The pair $(X, \|\cdot\|)$ is called a normed space.




It follows that $f(x) = \|x\|$ is also a positive homogeneous, convex function with $f(x) > 0$
for $x \neq 0$. It is not difficult to see that if $d(x, y) = f(x - y) = \|x - y\|$ for $x, y \in X$, then $d$ is
a metric in $X$ with $d(x, 0) = \|x\|$. Moreover, if $\rho$ is a metric in a vector space $X$ satisfying
the additional properties $\rho(x + z, y + z) = \rho(x, y)$, $x, y, z \in X$ (translation invariant) and $\rho(\lambda x, 0) = |\lambda| \rho(x, 0)$ with
$x \in X$, $\lambda \in \mathbb{R}$ (positive homogeneous), then $\|x\| = \rho(x, 0)$ is a norm. However, we will see
that some of the metrics defined do not derive from norms.

As in the case of metric spaces, for $X = X_1 \times X_2 \times \dots \times X_n$ the p-norms are defined from the normed spaces $(X_i, \|\cdot\|_i)$.

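Examples of normed spaces for $X = \mathbb{R}^n$:

• The p-norms for $1 \leq p < +\infty$:

\[ \|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p}. \]

The Euclidean norm ($p = 2$):

\[ \|x\|_2 = \Big( \sum_{i=1}^{n} |x_i|^2 \Big)^{1/2}. \]

The addition norm (1-norm) $\|\cdot\|_1$ and the norm $\|\cdot\|_\infty$ respectively:

\[ \|x\|_1 = \sum_{i=1}^{n} |x_i|, \]

\[ \|x\|_\infty = \max\{|x_i| : i = 1, 2, \dots, n\}. \]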

At this point we should emphasise that $1 \leq p \leq +\infty$, so that $\|\cdot\|_p$ is a norm in $\mathbb{R}^n$, which
can easily be proved using the Minkowski inequality.

Because of the demand for an optimization function, we set

\[ f_p(x) = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p}, \quad p > 0, \tag{13} \]

which is positive homogeneous for every $p > 0$, whilst convex only for $p \geq 1$. For $0 < p < 1$
the function is partially concave, hence the triangular inequality is not satisfied (Fig. 9).
The metric $\sigma_s(x, y) = |x - y|^s$, $x, y \in \mathbb{R}$, is translation invariant, though not positive
homogeneous for $s \neq 1$. Hence, $d_{(\sigma_s;\, p)}(x, 0)$ for $0 < s < 1$ in $\mathbb{R}^n$ are not norms (Fig. 10 for
$p = 1$).
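The statements about $f_p$ of equation (13) and about $d_{(\sigma_s;\, 1)}(x, 0)$ can be probed with a few evaluations. The sketch below is only illustrative: the names `f` and `g` and the test points are assumptions, and a single midpoint test can refute convexity but cannot establish it.

```python
import math

def f(x, p):
    """f_p(x) = (sum_i |x_i|^p)^(1/p) of equation (13), for p > 0."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def g(x, s):
    """g_s(x) = sum_i |x_i|^s, i.e. the distance of x from 0 in the metric with p = 1."""
    return sum(abs(xi) ** s for xi in x)

e1, e2 = (1.0, 0.0), (0.0, 1.0)
mid = (0.5, 0.5)   # midpoint of e1 and e2

# Midpoint convexity test: f(mid) <= (f(e1) + f(e2)) / 2 must hold for a convex f.
for p in (2.0, 1.0, 0.5):
    holds = f(mid, p) <= 0.5 * f(e1, p) + 0.5 * f(e2, p) + 1e-12
    print(p, f(mid, p), holds)

# Homogeneity: f_p scales linearly for every p > 0, g_s does not for s != 1.
x, x2 = (1.0, 2.0), (2.0, 4.0)
print(math.isclose(f(x2, 0.5), 2 * f(x, 0.5)), math.isclose(g(x2, 0.5), 2 * g(x, 0.5)))
```

For $p = 0.5$ the value at the midpoint is 2, twice the average of the endpoint values, which is a concrete violation of convexity.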

From another point of view, the geometric interpretation gives us a clear image of
all the above. For $p \geq 1$ the closed balls are convex sets, unlike for $0 < p < 1$. Suppose
$(X, \|\cdot\|)$ is a normed vector space; then the balls $S_{\|\cdot\|}(x, r)$ are always convex sets. Indeed, if
$y, z \in S_{\|\cdot\|}(x, r)$ then $\|y - x\|, \|z - x\| \leq r$, thus for $\lambda \in (0, 1)$,

\[ \|\lambda y + (1 - \lambda)z - x\| = \|\lambda(y - x) + (1 - \lambda)(z - x)\| \leq \lambda r + (1 - \lambda) r = r, \]

i.e. $\lambda y + (1 - \lambda)z \in S_{\|\cdot\|}(x, r)$, hence the
set $S_{\|\cdot\|}(x, r)$ is convex.

Remark: The property of convexity is of great importance. Suppose that $(X, \|\cdot\|)$
is a normed vector space and let $K \subset X$ be a convex and symmetric ($K = -K$) open
set, such that $R, r > 0$ exist with $S_{\|\cdot\|}(0, r) \subset K \subset S_{\|\cdot\|}(0, R)$. Then $K$ constitutes the unit
ball of another norm, i.e. $\|x\|_K = \inf\{\lambda > 0 : x \in \lambda K\}$ (Minkowski functional), which is
topologically equivalent to the initial one.
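The Minkowski functional of the Remark can be approximated numerically when membership in $K$ can be tested. The sketch below is a hypothetical illustration, not a method from the paper: the name `minkowski_functional`, the bisection approach, the initial upper bound `hi` and the choice of $K$ as the open unit ball of the 1-norm are all assumptions. Bisection is applicable because, for a convex set containing the origin, $x \in \lambda K$ implies $x \in \mu K$ for every $\mu > \lambda$.

```python
def minkowski_functional(x, contains, hi=1e6, iterations=100):
    """Approximate inf{lam > 0 : x in lam * K} by bisection.

    contains(v) tests whether v belongs to K. K is assumed convex, symmetric
    and to contain a ball around the origin; x is assumed to lie in hi * K.
    """
    if all(xi == 0 for xi in x):
        return 0.0
    lo = 0.0
    for _ in range(iterations):
        lam = (lo + hi) / 2
        # x in lam * K is equivalent to x / lam in K.
        if contains([xi / lam for xi in x]):
            hi = lam
        else:
            lo = lam
    return hi

# K: open unit ball of the 1-norm (illustrative choice).
in_l1_ball = lambda v: sum(abs(vi) for vi in v) < 1

print(minkowski_functional((3.0, -4.0), in_l1_ball))
```

With this choice of $K$ the functional reproduces the 1-norm, so the printed value is approximately $|3| + |-4| = 7$.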



Figure 9: Function $f(x, y) = (|x|^p + |y|^p)^{1/p}$ for $p > 0$, with panels i) $f(x, y) = (|x|^2 + |y|^2)^{1/2}$, ii) $f(x, y) = (|x|^5 + |y|^5)^{1/5}$, iii) $f(x, y) = (|x|^{1/2} + |y|^{1/2})^2$, iv) $f(x, y) = (|x|^{1/5} + |y|^{1/5})^5$. i) - ii) For $p \geq 1$ those functions are convex. iii) - iv) For $0 < p < 1$ the functions are not convex. i) - iv) All functions are positive homogeneous for all $p > 0$.

Figure 10: Function $g(x, y) = |x|^s + |y|^s$ for $s > 0$, with panels i) $g(x, y) = |x| + |y|$, ii) $g(x, y) = |x|^2 + |y|^2$, iii) $g(x, y) = |x|^5 + |y|^5$, iv) $g(x, y) = |x|^{1/2} + |y|^{1/2}$, v) $g(x, y) = |x|^{1/5} + |y|^{1/5}$. i) - iii) For $s \geq 1$ the functions are convex. iv) - v) For $0 < s < 1$ the functions are not convex. ii) - v) For $s \neq 1$ the functions are not positive homogeneous.